1 Introduction

The primary objective of this project is to identify and visualize meaningful trends in U.S. car accidents from 2016 to 2023. Our goal is to uncover when accidents occur most frequently (by time of day, day of the week, and month of the year) and to determine whether specific holidays are associated with increased accident rates. We also aim to explore geographic trends by identifying which states experience the highest and lowest numbers of accidents, and to assess whether environmental factors such as weather, visibility, or road conditions contribute to accident severity. Additionally, we will examine long-term trends in accident frequency to understand how they have changed over the years. By presenting our findings through a series of targeted visualizations, we hope to provide insights that could be valuable for public safety efforts, transportation planning, or future academic research.

2 Dataset Description

This analysis uses the U.S. Accidents (2016–2023) dataset compiled by Sobhan Moosavi, which is publicly available on Kaggle. The raw dataset contains about 7.7 million records (7,728,394 rows) of traffic accidents that occurred in the United States between February 2016 and March 2023.

2.1 Source

2.2 Description

Each row in the dataset represents a single traffic accident and contains information collected from traffic cameras, sensors, police reports, and other public sources. The data includes:

- Timestamp and location
- Weather conditions
- Traffic and visibility indicators
- Accident severity (rated 1 to 4)

2.3 Key Features Used in This Analysis

The following variables were selected or engineered for this project:

- Severity: Level of accident seriousness (1 = least severe, 4 = most severe)
- Start_Time: Timestamp of when the accident began
- Temperature(F), Precipitation(in), Wind_Chill(F), Visibility(mi): Weather-related variables
- Weather_Condition: Categorical weather label (e.g., Clear, Rain, Fog)
- State: Abbreviation of the U.S. state
- date_, hour, month, day_of_week: Time-based features derived from Start_Time
- holiday_specific: Boolean indicator for U.S. holidays (e.g., Memorial Day, Christmas)

These features were used to explore patterns in accident frequency and severity across time, weather, and holidays.
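
As a minimal sketch of how the time-based features could be derived from Start_Time, the base R snippet below builds them for three toy timestamps. The data frame name `accidents`, the UTC time zone, and the two-date `holiday_dates` calendar are assumptions made for illustration; the holiday list actually used in this project is not reproduced here.

```r
# Toy timestamps standing in for the Start_Time column (illustrative values).
accidents <- data.frame(
  Start_Time = as.POSIXct(
    c("2016-02-08 05:46:00", "2021-12-25 17:30:00", "2022-07-04 08:15:00"),
    tz = "UTC"
  )
)

# Hypothetical holiday calendar (assumption, not the project's real list).
holiday_dates <- as.Date(c("2021-12-25", "2022-07-04"))

# Derive the time-based features from Start_Time.
accidents$date_       <- as.Date(accidents$Start_Time, tz = "UTC")
accidents$hour        <- as.integer(format(accidents$Start_Time, "%H"))
accidents$month       <- as.integer(format(accidents$Start_Time, "%m"))
accidents$day_of_week <- weekdays(accidents$date_)

# Flag accidents that fall on one of the listed holidays.
accidents$holiday_specific <- accidents$date_ %in% holiday_dates

print(accidents)
```

Note that `weekdays()` returns day names in the current locale, so the labels may differ between machines.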

## spc_tbl_ [7,728,394 × 46] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ ID                   : chr [1:7728394] "A-1" "A-2" "A-3" "A-4" ...
##  $ Source               : chr [1:7728394] "Source2" "Source2" "Source2" "Source2" ...
##  $ Severity             : num [1:7728394] 3 2 2 3 2 3 2 3 2 3 ...
##  $ Start_Time           : POSIXct[1:7728394], format: "2016-02-08 05:46:00" "2016-02-08 06:07:59" ...
##  $ End_Time             : POSIXct[1:7728394], format: "2016-02-08 11:00:00" "2016-02-08 06:37:59" ...
##  $ Start_Lat            : num [1:7728394] 39.9 39.9 39.1 39.7 39.6 ...
##  $ Start_Lng            : num [1:7728394] -84.1 -82.8 -84 -84.2 -84.2 ...
##  $ End_Lat              : num [1:7728394] NA NA NA NA NA NA NA NA NA NA ...
##  $ End_Lng              : num [1:7728394] NA NA NA NA NA NA NA NA NA NA ...
##  $ Distance(mi)         : num [1:7728394] 0.01 0.01 0.01 0.01 0.01 0.01 0 0.01 0 0.01 ...
##  $ Description          : chr [1:7728394] "Right lane blocked due to accident on I-70 Eastbound at Exit 41 OH-235 State Route 4." "Accident on Brice Rd at Tussing Rd. Expect delays." "Accident on OH-32 State Route 32 Westbound at Dela Palma Rd. Expect delays." "Accident on I-75 Southbound at Exits 52 52B US-35. Expect delays." ...
##  $ Street               : chr [1:7728394] "I-70 E" "Brice Rd" "State Route 32" "I-75 S" ...
##  $ City                 : chr [1:7728394] "Dayton" "Reynoldsburg" "Williamsburg" "Dayton" ...
##  $ County               : chr [1:7728394] "Montgomery" "Franklin" "Clermont" "Montgomery" ...
##  $ State                : chr [1:7728394] "OH" "OH" "OH" "OH" ...
##  $ Zipcode              : chr [1:7728394] "45424" "43068-3402" "45176" "45417" ...
##  $ Country              : chr [1:7728394] "US" "US" "US" "US" ...
##  $ Timezone             : chr [1:7728394] "US/Eastern" "US/Eastern" "US/Eastern" "US/Eastern" ...
##  $ Airport_Code         : chr [1:7728394] "KFFO" "KCMH" "KI69" "KDAY" ...
##  $ Weather_Timestamp    : POSIXct[1:7728394], format: "2016-02-08 05:58:00" "2016-02-08 05:51:00" ...
##  $ Temperature(F)       : num [1:7728394] 36.9 37.9 36 35.1 36 37.9 34 34 33.3 37.4 ...
##  $ Wind_Chill(F)        : num [1:7728394] NA NA 33.3 31 33.3 35.5 31 31 NA 33.8 ...
##  $ Humidity(%)          : num [1:7728394] 91 100 100 96 89 97 100 100 99 100 ...
##  $ Pressure(in)         : num [1:7728394] 29.7 29.6 29.7 29.6 29.6 ...
##  $ Visibility(mi)       : num [1:7728394] 10 10 10 9 6 7 7 7 5 3 ...
##  $ Wind_Direction       : chr [1:7728394] "Calm" "Calm" "SW" "SW" ...
##  $ Wind_Speed(mph)      : num [1:7728394] NA NA 3.5 4.6 3.5 3.5 3.5 3.5 1.2 4.6 ...
##  $ Precipitation(in)    : num [1:7728394] 0.02 0 NA NA NA 0.03 NA NA NA 0.02 ...
##  $ Weather_Condition    : chr [1:7728394] "Light Rain" "Light Rain" "Overcast" "Mostly Cloudy" ...
##  $ Amenity              : logi [1:7728394] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Bump                 : logi [1:7728394] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Crossing             : logi [1:7728394] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Give_Way             : logi [1:7728394] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Junction             : logi [1:7728394] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ No_Exit              : logi [1:7728394] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Railway              : logi [1:7728394] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Roundabout           : logi [1:7728394] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Station              : logi [1:7728394] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Stop                 : logi [1:7728394] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Traffic_Calming      : logi [1:7728394] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Traffic_Signal       : logi [1:7728394] FALSE FALSE TRUE FALSE TRUE FALSE ...
##  $ Turning_Loop         : logi [1:7728394] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Sunrise_Sunset       : chr [1:7728394] "Night" "Night" "Night" "Night" ...
##  $ Civil_Twilight       : chr [1:7728394] "Night" "Night" "Night" "Day" ...
##  $ Nautical_Twilight    : chr [1:7728394] "Night" "Night" "Day" "Day" ...
##  $ Astronomical_Twilight: chr [1:7728394] "Night" "Day" "Day" "Day" ...
## tibble [7,546,771 × 2] (S3: tbl_df/tbl/data.frame)
##  $ State     : chr [1:7546771] "OH" "OH" "OH" "OH" ...
##  $ state_name: chr [1:7546771] "Ohio" "Ohio" "Ohio" "Ohio" ...

3 Descriptive Analysis

##       ID               Source             Severity    
##  Length:7546771     Length:7546771     Min.   :1.000  
##  Class :character   Class :character   1st Qu.:2.000  
##  Mode  :character   Mode  :character   Median :2.000  
##                                        Mean   :2.212  
##                                        3rd Qu.:2.000  
##                                        Max.   :4.000  
##                                                       
##    Start_Time                        End_Time                     
##  Min.   :2016-01-14 20:18:33.00   Min.   :2016-02-08 06:37:08.00  
##  1st Qu.:2018-11-20 16:22:02.00   1st Qu.:2018-11-20 17:22:44.50  
##  Median :2020-11-10 08:23:39.00   Median :2020-11-10 15:11:14.00  
##  Mean   :2020-06-02 04:07:56.43   Mean   :2020-06-02 11:34:12.32  
##  3rd Qu.:2022-01-19 08:15:20.50   3rd Qu.:2022-01-19 19:01:21.00  
##  Max.   :2023-03-31 23:30:00.00   Max.   :2023-03-31 23:59:00.00  
##                                                                   
##    Start_Lat       Start_Lng          End_Lat           End_Lng       
##  Min.   :24.55   Min.   :-124.62   Min.   :25        Min.   :-125     
##  1st Qu.:33.38   1st Qu.:-117.22   1st Qu.:33        1st Qu.:-118     
##  Median :35.80   Median : -87.81   Median :36        Median : -88     
##  Mean   :36.19   Mean   : -94.71   Mean   :36        Mean   : -96     
##  3rd Qu.:40.11   3rd Qu.: -80.38   3rd Qu.:40        3rd Qu.: -80     
##  Max.   :49.00   Max.   : -67.11   Max.   :49        Max.   : -67     
##                                    NA's   :3341777   NA's   :3341777  
##   Distance(mi)     Description           Street              City          
##  Min.   :  0.000   Length:7546771     Length:7546771     Length:7546771    
##  1st Qu.:  0.000   Class :character   Class :character   Class :character  
##  Median :  0.028   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :  0.558                                                           
##  3rd Qu.:  0.460                                                           
##  Max.   :441.750                                                           
##                                                                            
##     County             State             Zipcode            Country         
##  Length:7546771     Length:7546771     Length:7546771     Length:7546771    
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    Timezone         Airport_Code       Weather_Timestamp               
##  Length:7546771     Length:7546771     Min.   :2016-01-14 19:51:00.00  
##  Class :character   Class :character   1st Qu.:2018-11-20 16:15:00.00  
##  Mode  :character   Mode  :character   Median :2020-11-10 08:30:00.00  
##                                        Mean   :2020-06-02 04:08:26.56  
##                                        3rd Qu.:2022-01-19 07:58:00.00  
##                                        Max.   :2023-03-31 23:53:00.00  
##                                                                        
##  Temperature(F)   Wind_Chill(F)      Humidity(%)      Pressure(in)  
##  Min.   :-58.00   Min.   :-80.0     Min.   :  1.00   Min.   : 0.00  
##  1st Qu.: 49.00   1st Qu.: 43.0     1st Qu.: 48.00   1st Qu.:29.37  
##  Median : 64.00   Median : 62.0     Median : 67.00   Median :29.86  
##  Mean   : 61.67   Mean   : 58.3     Mean   : 64.84   Mean   :29.54  
##  3rd Qu.: 76.00   3rd Qu.: 75.0     3rd Qu.: 84.00   3rd Qu.:30.03  
##  Max.   :129.20   Max.   :128.0     Max.   :100.00   Max.   :58.63  
##                   NA's   :1833858   NA's   :10278    NA's   :7904   
##  Visibility(mi)   Wind_Direction     Wind_Speed(mph)  Precipitation(in) 
##  Min.   :  0.00   Length:7546771     Min.   :   0.0   Min.   : 0.00000  
##  1st Qu.: 10.00   Class :character   1st Qu.:   4.6   1st Qu.: 0.00000  
##  Median : 10.00   Mode  :character   Median :   7.0   Median : 0.00000  
##  Mean   :  9.09                      Mean   :   7.7   Mean   : 0.00613  
##  3rd Qu.: 10.00                      3rd Qu.:  10.4   3rd Qu.: 0.00000  
##  Max.   :140.00                      Max.   :1087.0   Max.   :36.47000  
##  NA's   :39499                       NA's   :429617                     
##  Weather_Condition   Amenity           Bump          Crossing      
##  Length:7546771     Mode :logical   Mode :logical   Mode :logical  
##  Class :character   FALSE:7453817   FALSE:7543317   FALSE:6688257  
##  Mode  :character   TRUE :92954     TRUE :3454      TRUE :858514   
##                                                                    
##                                                                    
##                                                                    
##                                                                    
##   Give_Way        Junction        No_Exit         Railway       
##  Mode :logical   Mode :logical   Mode :logical   Mode :logical  
##  FALSE:7511266   FALSE:6990004   FALSE:7527521   FALSE:7481985  
##  TRUE :35505     TRUE :556767    TRUE :19250     TRUE :64786    
##                                                                 
##                                                                 
##                                                                 
##                                                                 
##  Roundabout       Station           Stop         Traffic_Calming
##  Mode :logical   Mode :logical   Mode :logical   Mode :logical  
##  FALSE:7546527   FALSE:7348460   FALSE:7337406   FALSE:7539342  
##  TRUE :244       TRUE :198311    TRUE :209365    TRUE :7429     
##                                                                 
##                                                                 
##                                                                 
##                                                                 
##  Traffic_Signal  Turning_Loop    Sunrise_Sunset     Civil_Twilight    
##  Mode :logical   Mode :logical   Length:7546771     Length:7546771    
##  FALSE:6424853   FALSE:7546771   Class :character   Class :character  
##  TRUE :1121918                   Mode  :character   Mode  :character  
##                                                                       
##                                                                       
##                                                                       
##                                                                       
##  Nautical_Twilight  Astronomical_Twilight     date_                year_     
##  Length:7546771     Length:7546771        Min.   :2016-01-14   Min.   :2016  
##  Class :character   Class :character      1st Qu.:2018-11-20   1st Qu.:2018  
##  Mode  :character   Mode  :character      Median :2020-11-10   Median :2020  
##                                           Mean   :2020-06-01   Mean   :2020  
##                                           3rd Qu.:2022-01-19   3rd Qu.:2022  
##                                           Max.   :2023-03-31   Max.   :2023  
##                                                                              
##      month_         hour_       any_precip          sevg          
##  Min.   : 1.0   Min.   : 0.00   Mode :logical   Length:7546771    
##  1st Qu.: 3.0   1st Qu.: 8.00   FALSE:7016077   Class :character  
##  Median : 7.0   Median :13.00   TRUE :530694    Mode  :character  
##  Mean   : 6.7   Mean   :12.33                                     
##  3rd Qu.:10.0   3rd Qu.:17.00                                     
##  Max.   :12.0   Max.   :23.00                                     
##                                                                   
##   state_name       
##  Length:7546771    
##  Class :character  
##  Mode  :character  
##                    
##                    
##                    
## 
## [1] 7546771      53
## tibble [7,546,771 × 53] (S3: tbl_df/tbl/data.frame)
##  $ ID                   : chr [1:7546771] "A-1" "A-2" "A-3" "A-4" ...
##  $ Source               : chr [1:7546771] "Source2" "Source2" "Source2" "Source2" ...
##  $ Severity             : num [1:7546771] 3 2 2 3 2 3 2 3 2 3 ...
##  $ Start_Time           : POSIXct[1:7546771], format: "2016-02-08 05:46:00" "2016-02-08 06:07:59" ...
##  $ End_Time             : POSIXct[1:7546771], format: "2016-02-08 11:00:00" "2016-02-08 06:37:59" ...
##  $ Start_Lat            : num [1:7546771] 39.9 39.9 39.1 39.7 39.6 ...
##  $ Start_Lng            : num [1:7546771] -84.1 -82.8 -84 -84.2 -84.2 ...
##  $ End_Lat              : num [1:7546771] NA NA NA NA NA NA NA NA NA NA ...
##  $ End_Lng              : num [1:7546771] NA NA NA NA NA NA NA NA NA NA ...
##  $ Distance(mi)         : num [1:7546771] 0.01 0.01 0.01 0.01 0.01 0.01 0 0.01 0 0.01 ...
##  $ Description          : chr [1:7546771] "Right lane blocked due to accident on I-70 Eastbound at Exit 41 OH-235 State Route 4." "Accident on Brice Rd at Tussing Rd. Expect delays." "Accident on OH-32 State Route 32 Westbound at Dela Palma Rd. Expect delays." "Accident on I-75 Southbound at Exits 52 52B US-35. Expect delays." ...
##  $ Street               : chr [1:7546771] "I-70 E" "Brice Rd" "State Route 32" "I-75 S" ...
##  $ City                 : chr [1:7546771] "Dayton" "Reynoldsburg" "Williamsburg" "Dayton" ...
##  $ County               : chr [1:7546771] "Montgomery" "Franklin" "Clermont" "Montgomery" ...
##  $ State                : chr [1:7546771] "OH" "OH" "OH" "OH" ...
##  $ Zipcode              : chr [1:7546771] "45424" "43068-3402" "45176" "45417" ...
##  $ Country              : chr [1:7546771] "US" "US" "US" "US" ...
##  $ Timezone             : chr [1:7546771] "US/Eastern" "US/Eastern" "US/Eastern" "US/Eastern" ...
##  $ Airport_Code         : chr [1:7546771] "KFFO" "KCMH" "KI69" "KDAY" ...
##  $ Weather_Timestamp    : POSIXct[1:7546771], format: "2016-02-08 05:58:00" "2016-02-08 05:51:00" ...
##  $ Temperature(F)       : num [1:7546771] 36.9 37.9 36 35.1 36 37.9 34 34 33.3 37.4 ...
##  $ Wind_Chill(F)        : num [1:7546771] NA NA 33.3 31 33.3 35.5 31 31 NA 33.8 ...
##  $ Humidity(%)          : num [1:7546771] 91 100 100 96 89 97 100 100 99 100 ...
##  $ Pressure(in)         : num [1:7546771] 29.7 29.6 29.7 29.6 29.6 ...
##  $ Visibility(mi)       : num [1:7546771] 10 10 10 9 6 7 7 7 5 3 ...
##  $ Wind_Direction       : chr [1:7546771] "Calm" "Calm" "SW" "SW" ...
##  $ Wind_Speed(mph)      : num [1:7546771] NA NA 3.5 4.6 3.5 3.5 3.5 3.5 1.2 4.6 ...
##  $ Precipitation(in)    : num [1:7546771] 0.02 0 0 0 0 0.03 0 0 0 0.02 ...
##  $ Weather_Condition    : chr [1:7546771] "Light Rain" "Light Rain" "Overcast" "Mostly Cloudy" ...
##  $ Amenity              : logi [1:7546771] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Bump                 : logi [1:7546771] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Crossing             : logi [1:7546771] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Give_Way             : logi [1:7546771] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Junction             : logi [1:7546771] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ No_Exit              : logi [1:7546771] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Railway              : logi [1:7546771] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Roundabout           : logi [1:7546771] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Station              : logi [1:7546771] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Stop                 : logi [1:7546771] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Traffic_Calming      : logi [1:7546771] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Traffic_Signal       : logi [1:7546771] FALSE FALSE TRUE FALSE TRUE FALSE ...
##  $ Turning_Loop         : logi [1:7546771] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ Sunrise_Sunset       : chr [1:7546771] "Night" "Night" "Night" "Night" ...
##  $ Civil_Twilight       : chr [1:7546771] "Night" "Night" "Night" "Day" ...
##  $ Nautical_Twilight    : chr [1:7546771] "Night" "Night" "Day" "Day" ...
##  $ Astronomical_Twilight: chr [1:7546771] "Night" "Day" "Day" "Day" ...
##  $ date_                : Date[1:7546771], format: "2016-02-08" "2016-02-08" ...
##  $ year_                : num [1:7546771] 2016 2016 2016 2016 2016 ...
##  $ month_               : num [1:7546771] 2 2 2 2 2 2 2 2 2 2 ...
##  $ hour_                : int [1:7546771] 5 6 6 7 7 7 7 7 8 8 ...
##  $ any_precip           : logi [1:7546771] TRUE FALSE FALSE FALSE FALSE TRUE ...
##  $ sevg                 : chr [1:7546771] "more severe" "less severe" "less severe" "more severe" ...
##  $ state_name           : chr [1:7546771] "Ohio" "Ohio" "Ohio" "Ohio" ...

The raw dataset contains 7,728,394 observations (rows) of 46 variables (columns).

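After data preparation and cleaning, the dataset contains 7,546,771 observations (rows) of 53 variables (columns). Cleaning therefore removed 181,623 rows (about 2.35% of the raw data) and added 7 derived columns: date_, year_, month_, hour_, any_precip, sevg, and state_name.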
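
Comparing the raw and cleaned structures above suggests that missing precipitation values were set to 0 and that any_precip and sevg were derived from Precipitation(in) and Severity. A minimal sketch of such a step follows. It assumes that a missing precipitation value means no recorded precipitation and that severity levels 1 to 4 map to the four labels in order. The toy rows and the object names `raw`, `cleaned`, and `sev_labels` are ours, not the project's.

```r
# Toy rows mirroring the first raw values of Severity and Precipitation(in).
raw <- data.frame(
  Severity            = c(3, 2, 2, 3),
  `Precipitation(in)` = c(0.02, 0, NA, NA),
  check.names = FALSE
)

cleaned <- raw

# Assumption: a missing precipitation reading is treated as 0 inches.
missing_precip <- is.na(cleaned[["Precipitation(in)"]])
cleaned[["Precipitation(in)"]][missing_precip] <- 0

# Logical flag: was any precipitation recorded?
cleaned$any_precip <- cleaned[["Precipitation(in)"]] > 0

# Assumed label mapping for the 1-4 severity scale.
sev_labels <- c("least severe", "less severe", "more severe", "most severe")
cleaned$sevg <- sev_labels[cleaned$Severity]

print(cleaned)
```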

| Severity | Number of Accidents |
|---|---|
| least severe | 66,121 |
| less severe | 6,010,987 |
| more severe | 1,272,321 |
| most severe | 197,342 |

The dataset's author defines severity as “the impact on traffic.” Low-severity accidents have a minimal effect on traffic, whereas high-severity accidents have a significant impact.

The majority of accidents that took place between 2016 and 2023 were categorized as “less severe,” accounting for 6,010,987 of the 7,546,771 accidents in the cleaned data (about 79.6%).
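
The share of each severity group can be checked directly from the counts reported above. The short base R sketch below does this arithmetic; the vector name `severity_counts` is our own.

```r
# Counts per severity group, copied from the table above.
severity_counts <- c(
  "least severe" = 66121,
  "less severe"  = 6010987,
  "more severe"  = 1272321,
  "most severe"  = 197342
)

# The counts should add up to the number of rows in the cleaned data.
total <- sum(severity_counts)

# Percentage share of each group, rounded to one decimal place.
shares <- round(100 * severity_counts / total, 1)
print(shares)
```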

3.4 Statistical Analysis

3.4.1 Correlation Analysis of Key Quantitative Features

The heatmap shows the correlations between quantitative features: temperature, wind chill, visibility, precipitation, and severity. Temperature and wind chill were nearly perfectly correlated (\(r = 0.99\)), as expected. Severity, however, had only weak linear correlations with all other variables, suggesting that accident severity is influenced by additional factors beyond those measured here.
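
A heatmap like this is typically built from a correlation matrix. The sketch below shows one way to compute such a matrix in base R on synthetic data. The simulated values, the missing wind-chill readings, and the choice of `use = "pairwise.complete.obs"` are assumptions for illustration, not the settings used for the figure.

```r
set.seed(42)
n <- 500

# Synthetic weather readings; wind chill is built to track temperature closely.
temperature <- rnorm(n, mean = 60, sd = 18)
wind_chill  <- temperature - abs(rnorm(n, mean = 2, sd = 1.5))
wind_chill[sample(n, 50)] <- NA   # mimic missing wind-chill readings

toy <- data.frame(
  Severity            = sample(1:4, n, replace = TRUE,
                               prob = c(0.01, 0.80, 0.17, 0.02)),
  `Temperature(F)`    = temperature,
  `Wind_Chill(F)`     = wind_chill,
  `Visibility(mi)`    = pmin(10, rexp(n, rate = 0.1)),
  `Precipitation(in)` = rexp(n, rate = 50),
  check.names = FALSE
)

# Pearson correlations, using all rows available for each pair of columns.
cor_mat <- cor(toy, use = "pairwise.complete.obs")
print(round(cor_mat, 2))
```

Because Severity is an ordinal score from 1 to 4, a rank-based alternative such as `method = "spearman"` may also be worth checking.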

3.4.2 ANOVA on Accident Severity by Weather Condition

A one-way ANOVA was conducted to examine whether accident severity differs by weather condition. The results showed a statistically significant effect of weather on accident severity, \(F(4, 1,\!814,\!823) = 18,\!549\), \(p < .001\), indicating that the average severity of accidents varies across weather conditions. Because the test is based on more than 1.8 million observations, statistical significance by itself says little about how large these differences are.

##                  Df Sum Sq Mean Sq F value Pr(>F)    
## weather           4  18624    4656   18549 <2e-16 ***
## Residuals   1814823 455533       0                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
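
The F statistic can be recomputed from the sums of squares and degrees of freedom in the table, which also gives a rough effect size. The sketch below uses only the rounded values printed above, so the results are approximate. Reporting eta squared (\(\eta^2\)) as the effect size is our addition and was not part of the original output.

```r
# Values copied from the ANOVA table above (rounded as printed).
ss_weather  <- 18624
df_weather  <- 4
ss_residual <- 455533
df_residual <- 1814823

# Mean squares: the residual mean square is about 0.251,
# which the table displays as 0 after rounding.
ms_weather  <- ss_weather / df_weather
ms_residual <- ss_residual / df_residual

# F statistic and its upper-tail p-value.
f_stat  <- ms_weather / ms_residual
p_value <- pf(f_stat, df_weather, df_residual, lower.tail = FALSE)

# Eta squared: share of total variance in severity explained by weather.
eta_sq <- ss_weather / (ss_weather + ss_residual)

print(c(F = f_stat, eta_squared = eta_sq))
```

With these inputs, eta squared comes out at about 0.039. Weather condition therefore accounts for roughly 4% of the variance in severity, even though the F statistic is very large.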

3.4.3 T-Tests on Severity and Frequency for Holidays

3.4.3.1 Severity on Specific Holidays T-Test

A Welch two-sample t-test was conducted to compare accident severity on specific holidays versus other days. The results showed a statistically significant difference in severity scores, \(t(93,\!469) = 2.50\), \(p = .0125\). The average severity on non-holidays (\(M = 2.212\)) was slightly higher than on holidays (\(M = 2.208\)), with a 95% confidence interval for the difference in means ranging from 0.0009 to 0.0073.

3.4.3.2 Frequency on Specific Holidays T-Test

A Welch two-sample t-test was also conducted to examine differences in the average number of accidents per day on holidays versus non-holidays. The results were statistically significant, \(t(43.04) = 3.27\), \(p = .0021\). The mean number of accidents per day was higher on non-holidays (\(M = 2,\!947\)) compared to holidays (\(M = 2,\!173\)), with a 95% confidence interval for the difference in means ranging from 297 to 1,250.
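
The frequency test compares daily counts, so the accident-level rows first have to be collapsed to one row per calendar day. The sketch below illustrates that workflow on synthetic data. The simulated dates, the four-date `holiday_dates` calendar, and the 0.6 down-weighting of holidays are all assumptions. Only the variable names n_acc and holiday_specific are taken from the test output.

```r
set.seed(1)

# Synthetic accident-level data: 50,000 accident dates drawn across 2022,
# with holidays down-weighted so they receive fewer accidents on average.
all_days       <- seq(as.Date("2022-01-01"), as.Date("2022-12-31"), by = "day")
holiday_dates  <- as.Date(c("2022-05-30", "2022-07-04", "2022-11-24", "2022-12-25"))
weights        <- ifelse(all_days %in% holiday_dates, 0.6, 1)
accident_dates <- sample(all_days, size = 50000, replace = TRUE, prob = weights)

# Collapse to one row per calendar day (days with zero accidents are kept).
day_factor <- factor(as.character(accident_dates), levels = as.character(all_days))
daily <- data.frame(
  date_ = all_days,
  n_acc = as.integer(table(day_factor))
)
daily$holiday_specific <- daily$date_ %in% holiday_dates

# t.test() defaults to var.equal = FALSE, which gives the Welch test.
result <- t.test(n_acc ~ holiday_specific, data = daily)
print(result)
```

The Welch variant is a natural choice here because the two groups are very unequal in size and need not share a common variance.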

## [1] "T-test on Severity (Specific Holidays):"
## 
##  Welch Two Sample t-test
## 
## data:  Severity by holiday_specific
## t = 2.4975, df = 93469, p-value = 0.01251
## alternative hypothesis: true difference in means between group FALSE and group TRUE is not equal to 0
## 95 percent confidence interval:
##  0.0008838382 0.0073295171
## sample estimates:
## mean in group FALSE  mean in group TRUE 
##            2.212178            2.208071
## [1] "T-test on Frequency (Specific Holidays):"
## 
##  Welch Two Sample t-test
## 
## data:  n_acc by holiday_specific
## t = 3.2727, df = 43.041, p-value = 0.002105
## alternative hypothesis: true difference in means between group FALSE and group TRUE is not equal to 0
## 95 percent confidence interval:
##   296.8096 1249.9020
## sample estimates:
## mean in group FALSE  mean in group TRUE 
##            2946.832            2173.476
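
As a consistency check, the confidence interval and p-value of the frequency test can be approximately reconstructed from the printed group means, t statistic, and degrees of freedom. The sketch below uses only those rounded values, so small discrepancies from the output are expected. The variable names are ours.

```r
# Values copied from the frequency t-test output above.
mean_non_holiday <- 2946.832
mean_holiday     <- 2173.476
t_stat           <- 3.2727
df_welch         <- 43.041

# Difference in means and the standard error implied by the t statistic.
diff_means <- mean_non_holiday - mean_holiday
std_error  <- diff_means / t_stat

# 95% confidence interval from the t distribution with Welch degrees of freedom.
t_crit <- qt(0.975, df_welch)
ci     <- diff_means + c(-1, 1) * t_crit * std_error

# Two-sided p-value.
p_value <- 2 * pt(abs(t_stat), df_welch, lower.tail = FALSE)

print(round(c(lower = ci[1], upper = ci[2], p = p_value), 4))
```

The implied standard error is about 236 accidents per day. That is large relative to the difference of 773, which is consistent with the wide interval reported.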

3.4.3.3 Visualization: Severity on Holidays

Although the difference is small, the chart shows a slightly higher average severity for accidents on non-holidays than on holidays: the mean severity was 2.212 on non-holidays and 2.208 on holidays. The corresponding Welch t-test (\(t(93,\!469) = 2.50\), \(p = .0125\)) indicates that this difference is statistically significant, but a gap of about 0.004 on a 1-to-4 scale is negligible in practice; with millions of records, even very small differences reach significance. In other words, holidays see fewer accidents, but those accidents are about as severe as on any other day.

3.4.3.4 Visualization: Frequency of Accidents on Holidays

The bar chart clearly shows that the average number of accidents per day is significantly lower on specific holidays compared to non-holiday dates. On average, there were around 2,173 accidents per day on holidays versus 2,947 on non-holidays. This visual supports the results of the Welch two-sample t-test (\(t(43.04) = 3.27\), \(p = .0021\)), confirming that this difference is statistically significant. The lower volume on holidays may reflect reduced traffic due to time off from work and school.